Implicitly-Multithreaded Processors
Authors
Abstract
This paper proposes the Implicitly-MultiThreaded (IMT) architecture to execute compiler-specified speculative threads on a modified Simultaneous Multithreading (SMT) pipeline. IMT reduces hardware complexity by relying on the compiler to select suitable thread spawning points and orchestrate inter-thread register communication. To enhance IMT’s effectiveness, this paper proposes three novel microarchitectural mechanisms: (1) a resource- and dependence-based fetch policy to fetch and execute suitable instructions, (2) context multiplexing to improve utilization and map as many threads to a single context as allowed by availability of resources, and (3) early thread invocation to hide thread start-up overhead by overlapping one thread’s invocation with other threads’ execution. We use SPEC2K benchmarks and cycle-accurate simulation to show that a microarchitecture-optimized IMT improves performance on average by 24% and at best by 69% over an aggressive superscalar. We also compare IMT to two prior proposals, TME and DMT, for speculative threading on an SMT using hardware-extracted threads. Our best IMT design outperforms a comparable TME and DMT on average by 26% and 38% respectively.
Similar resources
Dynamic tiling for effective use of shared caches on multithreaded processors
Simultaneous multithreaded (SMT) processors use data caches which are dynamically shared between threads. Depending on the processor workload, sharing the data cache may harm performance due to excessive cache conflicts. A way to overcome this problem is to physically partition the cache between threads. Unfortunately, partitioning the cache requires additional hardware and may lead to lower ut...
Measuring the Performance of Multithreaded Processors
Nowadays, multithreaded architectures are becoming more and more popular. In fact, many processor vendors have already shipped processors with multithreaded features. Regardless of this push on multithreaded processors, still today there is not a clear procedure that defines how to measure the behavior of a multithreaded processor. This paper presents FAME, a new evaluation methodology aimed to...
Exploiting Thread-Level Parallelism on Simultaneous Multithreaded Processors
Operating System Scheduling for Chip Multithreaded Processors
This dissertation addresses operating system thread scheduling for chip multithreaded processors. Chip multithreaded processors are becoming mainstream thanks to their superior performance and power characteristics. Threads running concurrently on a chip multithreaded processor share the processor’s resources. Resource contention, and accordingly performance, depends on characteristics of the c...
Functional Unit Usage Based Thread Selection in a Simultaneous Multithreaded Processor
This paper proposes and evaluates a new mechanism for thread selection in simultaneous multithreaded processors that is based on functional unit (FU) usage information. The performance of any processor depends on the set of dependences that it can manage. In a multithreaded architecture there is an opportunity to manage structural dependences more effectively than in conventional superscalar pro...
Journal title:
Volume, issue:
Pages: -
Publication date: 2003